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Assimilating remote sensing observations of leaf area index and 
soil moisture for wheat yield estimates : An observing system 
simulation experiment 
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[i] Observing system simulation experiments were used to investigate ensemble Bayesian 
state-updating data assimilation of observations of leaf area index (LAI) and soil moisture 
( 9 ) for the puipose of improving single-season wheat yield estimates with the Decision 
Support System for Agrotechnology Transfer (DSSAT) CropSim-Ceres model. 

Assimilation was conducted in an energy-limited environment and a water-limited 
environment. Modeling uncertainty was prescribed to weather inputs, soil parameters and 
initial conditions, and cultivar parameters and through perturbations to model state 
transition equations. The ensemble Kalman filter and the sequential importance resampling 
filter were tested for the ability to attenuate effects of these types of uncertainty on yield 
estimates. LAI and 9 observations were synthesized according to characteristics of existing 
remote sensing data, and effects of observation error were tested. Results indicate that the 
potential for assimilation to improve end-of-season yield estimates is low. Limitations are 
due to a lack of root zone soil moisture information, error in LAI observations, and a lack of 
correlation between leaf and grain growth. 
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1. Introduction 

[ 2 ] Dynamic crop models, such as the Decision Support 
System for Agrotechnology Transfer crop simulation 
model (DSSAT) [Hoogenboom et al., 2004], are used to aid 
decision making under uncertainty [Jones et al., 2003]. For 
instance, DSSAT is used by the insurance industry to pre- 
dict regional crop yields on a seasonal basis. Crop simula- 
tion models have an advantage over empirical models of 
agricultural productivity in that they can react dynamically 
to changes in local conditions in a physically and biologi- 
cally meaningful way. However, because of uncertainties 
in model representations of real-world systems and because 
of uncertainties inherent in input data regarding soils, culti- 
var genetics and weather, any model-based estimate of 
agricultural yield will be subject to error. One approach to 
mitigating this type of error is to constrain model simulations 
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using remote sensing observations through a process of data 
assimilation [Liu and Gupta, 2007]. 

[ 3 ] Remote sensing measurements related to agriculture 
generally contain information about weather, vegetation or 
soil. Information about weather is used to force crop simu- 
lations directly. Remotely sensed information about vegeta- 
tion often comes in the form of a leaf area index (LAI 
(m 2 m~ 2 ) [e.g., Knyazikhin et al., 1999]), which is a crop 
model component related to canopy cover. Similarly, soil 
moisture is a model state variable that acts as the primary 
control on plant water stress and observations of volumetric 
moisture content ( 9 (m 3 m~ 3 )) in the top few centimetres of 
soil (6*i ) are available from remote sensing sources 
(AMSR-E [Njoku et al., 2003], SMOS [Kerr et al., 2010], 
and SMAP [Entekhabi et al., 2010]). Together LAI and SW 
observations provide complementary information for agri- 
cultural monitoring. 

[ 4 ] There are many types of data assimilation which are 
common in agronomy [Moulin et al., 1998; Prevot et al, 
2003]. This work investigates the potential for ensemble 
Bayesian state-updating filters [McLaughlin, 2002] to miti- 
gate modeling uncertainty on end-of-season wheat yield 
estimates. Conceptually, ensemble Bayesian filters operate 
on the principle that a probability density function (pdf) 
representing uncertainty in model states can be approxi- 
mated by a discrete set of model simulations, and that a pdf 
of model predictions can be estimated using Monte Carlo 
integration to marginalize uncertainty in model states. 
From a Bayesian perspective, the physical model provides 
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context (a prior and likelihood) for interpreting information 
contained in remotely sensed data. 

[5] Currently, a robust understanding of the response of 
physically based model estimates of agricultural yield to 
state-updating assimilation remains lacking. The first step 
in this process is to perform a controlled synthetic data 
study, also called an observing system simulation experi- 
ment (OSSE) [. Arnold and Dey, 1986], which will allow for 
an analysis of interactions between uncertainty, observa- 
tions and the model. Although both Paawels et al. [2007] 
and Pellenq and Boulet [2004] present OSSEs which inves- 
tigate the assimilation of LAI and/or 0\ into crop simula- 
tion models, these studies assess the effects of assimilation 
on model states; neither investigates the impact of data 
assimilation on yield estimates, de Wit and van Diepen 
[2007] present a case study on the effects of assimilating 9\ 
observations on yield estimates; however, this does not 
provide sufficient statistical and methodological control to 
differentiate limitations imposed by the model, the assimi- 
lation algorithm, and uncertainty in model inputs and 
observations. 

[6] We present a set of OSSEs which assess LAI and 9 
assimilation for improving DSSAT CropSim-Ceres wheat 
yield estimates in a controlled synthetic environment. This 
allows for an understanding of model response to state 
updating and a delineation of the effects of modeling uncer- 
tainty, filter error, and observation error. This investigation 
provides a benchmark for interpreting the results of case 
studies [e.g., de Wit and van Diepen, 2007] and a foundation 
on which to direct the development of agricultural models 
and remote sensing algorithms aimed at predicting yield. 

2. Methods 

[7] Several experiments are presented. First, modeling 
uncertainty was partitioned into isolated sources: weather 
inputs, soil parameters and initial conditions, cultivar param- 
eters, and model state equations. Synthesized remote sensing 
observations were assimilated using the ensemble Kalman 
filter (EnKF) [Evensen, 2003] and the sequential importance 
resampling filter (SIRF) [Gordon et al., 1993], and mean 
yield predictions from these filters were assessed. In addi- 
tion, observations with variable error characteristics were 
assimilated to test the effects of observation error on EnKF 
and SIRF results. Sections 2. 1-2.4 describe the model, data 
assimilation filters, and sets of numerical experiments. 

2.1. The DSSAT CropSim Ceres Wheat Model 

[8] DSSAT is a collection of independent crop growth 
modules supported by a land process wrapper. Integration 
takes place on a daily time step and the forcing data required 
are daily maximum and minimum temperature, daily inte- 
grated solar radiation and daily cumulative precipitation. 

[ 9 ] DSSAT soil layering is user defined; we used nine 
layers with one surface layer representing the 0-5 cm of 
soil typically assumed to be visible to L band wavelength 
satellites and a set of lower layers reaching a total depth of 
1.8 m. DSSAT soil moisture is calculated using a Richie- 
type soil water balance [Richie, 1998], which employs a 
curve number approach to partitioning runoff and updates the 
water content of each soil layer on the basis of a set of linear 
drainage equations. The soil surface parameters are : a runoff 


curve number, an upper limit on evaporation, a drainage rate 
parameter, and albedo. Soil layers are parametrized by satu- 
rated water content (porosity), drained upper limit (field 
capacity), lower limit, saturated hydraulic conductivity, and a 
root growth factor. Similar to Mo et al. [2005], we used the 
soil water balance routines but did not simulate the soil nitro- 
gen balance or any management decisions. This was done 
because it is impossible to presume that information about 
these aspects of agricultural development would be available 
at remote sensing scales. 

[ 10 ] The CropSim-Ceres module (CC) simulates wheat 
crops. The CC models yield as a function of a grain weight 
state. Grain growth is developmentally dependent on daily 
development units, which are a function of mean daily tem- 
perature and daily cumulative solar radiation, a temperature 
control factor, vegetation biomass (/? (kg/plant)) defined as 
the sum of mass storage model states stem weight, leaf 
weight, and reserves weight, and model parameters. The 
most important crop model parameters are related to the 
cultivar: vernalizing duration, which specifies the number 
of days of optimum temperature necessary for vernaliza- 
tion; photoperiod response, which specifies the percent 
reduction in photosynthesis for every 10 h reduction in pho- 
toperiod; grain filling phase duration in growing degree- 
days (°C days); number of kernels per unit plant weight 
(number g 1 ) ; the standard kernel size (mg); the standard 
tiller weight (g) ; and the photoperiod interval between leaf 
tip appearances. 

[ 11 ] In contrast to grain weight, LAI is a function of the 
model state plant leaf area which is developmentally de- 
pendent on a temperature control factor and has a potential 
value set by the number of plant leaves, which is in turn 
determined at each time step by the cumulative sum of 
daily development units. Again, in contrast to grain growth, 
potential daily leaf growth is attenuated by an additive fac- 
tor proportional to water stress, S w € [0,1], so that a stress 
factor of 0 indicates potential growth and a stress factor of 
1 indicates no growth. Potential grain growth is not modi- 
fied in this way; the other components of biomass are 
affected indirectly by stress through leaf assimilation of 
plant carbon reserves. Water stress is the ratio of total root 
water uptake to potential transpiration, which is a fraction 
of potential evapotranspiration calculated according to the 
Priestley-Taylor method [Priestley and Taylor, 1972], Root 
water uptake from each layer is a function of the difference 
between soil moisture state and the lower limit parameter. 
Thus when sufficient soil moisture is available to supply 
transpiration demand, water stress is zero. Given the way 
model develops vegetation and grain components of the 
wheat plant, we know that LAI and 9 will inform yield by 
informing /?. 

[ 12 ] The model state vector contains all of the internal 

dynamic model variables necessary to transition the simu- 
lation from one time step to the next, that is, all of the 
Markov information. More specifically, at given time t, 
state (x,) is a function of the state at the previous time 
(x f _i), forcing data at the current time (u,), and time-invari- 
ant model parameters ( 7 ?) according to the state transition 
relationship : 

x t = M.i(im,Ui,i)). (la) 
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The combined land process wrapper and CC Markov state 
vector has 97 components (Tables 1 and 2). The model 
output vector at time t (y, ; here we use the term output to 
refer to model predictions which correspond directly with 
observations) is calculated according to the relationship 
M y {-) as a function of the current state, current inputs and 
parameters : 

y, = M y (x. t , u,, $). (lb) 

For our purposes, the output vector contained the quantities 
#i :9 (soil moisture in each of 9 soil layers) and LAI; 0 i-.g 
are also state variables so Af v (-) simply preserves these 
values through identity relationships. LAI is not a state 
variable because its value is calculated independently at 
each time step as a function of the state plant leaf area. 

2.2. The Ensemble Kalman Filter 

[ 13 ] The EnKF is commonly used for state updating in 
moderately nonlinear geophysical models [Reichle, 2008]. 
It estimates the model state pdf by drawing N ens samples 
from a joint uncertainty pdf over model parameters, forcing 
data, and state perturbations, and then propagates this sam- 
ple through time using the model equations. This set of 
model simulations is the ensemble. At every observation 
time, the EnKF updates the state pdf on the basis of the 
assumptions that all model states are linearly related to 
model output and that uncertainty in model states, model 
output, and observations can be quantified by second-order 


Table 1. The Decision Support System for Agrotechnology 
Transfer (DSSAT) Markov State Vector (33 Components) 


Component 

Description 

Units 

ATOT 

Sum of last 5 days soil temperature 

c 

CANHT 

Canopy height 

m 

DRN 

Drained soil water 

cm 

EO 

Potential evaporation 

mm d~ 1 

EOP 

Potential plant transpiration 

mm d~ 1 

EORATIO 

Increase in evaporation per unit LAI 

mm d~ 1 

EOS 

Potential soil evaporation 

mm d' 1 

EP 

Plant transpiration 

mm d~ 1 

ES 

Soil evaporation rate 

mm d _ 1 

FRACRTS 

KSEVAP 

KTRANS 

PORMIN 

Fraction of soil contact w/roots 
Light extinction coefficient (evaporation) 
Light extinction coefficient (transpiration) 
Minimum pore space for O 2 to plants 

m 3 m- 3 

RLV 

Root volume by soil layer 

m 3 m -3 

RWUMX 

Maximum uptake per unit root length 

m 3 m -3 

SNOW 

Snow accumulation 

mm 

SRFTEMP 

Surface soil temperature 

°C 

SSOMC 

Soil Carbon 

kg ha - 1 

ST 

Soil temperature 

°C 

STGDOY 

Stage transition dates 

day 

SUMES1 

Cumulative stage 1 soil evaporation 

mm 

SUMES2 

Cumulative stage 2 soil evaporation 

mm 

SW 

Soil water 

m 3 m -3 

SWDELTS 

Drainage rate 

m 3 m- 3 

SWDELTU 

Change in SW due to evaporation 

m 3 m -3 

SWDELTX 

Change in SW due to plant uptake 

m 3 m -3 

TMA 

Last 5 days of soil temperature 

°C 

TRWU 

Total root water uptake 

mm 

TRWUP 

Total potential root water uptake 

m 3 m- 3 

TSOILEV 

Duration stage 2 evaporation 

days 

TSS 

Number of days with saturated soil 

days 

UPFLOW 

Upward flow due to evaporation 

cm 

WINF 

Infiltration 

mm 


pdf approximations. Because the method has been widely 
discussed, we only present a brief overview and follow a 
variation on the formulation of Houtekamer and Mitchell 
[ 2001 ], 


Table 2. The CropSim-Ceres Wheat Module Markov State Vector 
(64 Components) 


Component 

Description 

Units 

ADATEND 

Anthesis end date 


AFLFSUM 

Carbohydrate leaf factor 


CARBOC 

Cumulative carbohydrate assimilated 

g/plant 

CHRSWT 

Chaff reserves 

g/plant 

CHWT 

Chaff weight 

g/plant 

CUMDU 

Cumulative development units 

°C days 

CUMGEU 

Cumulative germination units 

°C days 

CUMVD 

Cumulative vernalization days 

days 

DAE 

Days after emergence 


DEADWT 

Dead leaf weight retained on plant 

g/plant 

G2 

Coefficient of grain growth (modified) 

mg/d °C 

GEDSUM 

Germination plus emergence duration 

days 

GESTAGE 

Germination, emergence stage 


GET SUM Germination plus emergence temperature sum 

°C 

GPLA 

Green leaf area 

cm 2 /plant 

GPLASENS 

Green leaf area during senescence 

cm 2 /plant 

GRNUM 

Grains per plant 

number/plant 

GRWT 

Grain weight 

g/plant 

ISTAGE 

Current developmental stage 


ISTAGEP 

Previous developmental stage 


LAGSTAGE 

Lag phase for grain filling stage 


LAP 

Leaf area by leaf number 

cm 2 /plant 

LAPOT 

Leaf area potentials by leaf number 

cm 2 /leaf 

LAPS 

Leaf area senesced by leaf number 

cm 2 /plant 

LFWT 

Leaf weight 

g/plant 

LLRSWAD 

Leaf lamina reserves weight 

kg ha -1 

LLRSWT 

Leaf lamina reserves 

g/plant 

LLWAD 

Leaf lamina weight 

kg ha -1 

LNUMSD 

Leaf number per Haun stage 


LNUMSG 

Growing leaf number 


LSHAI 

Leaf sheath area index 

2 -2 

m m 

LSHRSWAD 

Leaf sheath reserves weight 

kg ha -1 

LSHRSWT 

Leaf sheath reserves 

g/plant 

LSHWAD 

Leaf sheath weight 

kg ha -1 

PARI 

PAR interception fraction 


PLA 

Plant leaf area 

cm 2 

PLTPOP 

Plant Population 

plants/m 2 

RSTAGE 

Reproductive development stage 


RSWT 

Reserves weight 

g/plant 

RTDEP 

Root depth 

cm 

RTWT 

Root weight 

g/plant 

RTWTL 

Root weight by layer 

g/plant 

SEEDRS 

Seed reserves 

g/plant 

SEEDRSAV 

Seed reserves available 

g/plant 

SENCL 

Senesced Carbon by layer 

g/plant 

SENCS 

Senesced Carbon added to soil 

g/jdant 

SENLA 

Cumulative senesced leaf area 

cm 2 /plant 

SRADSUM 

Cumulative radiation 

MJnU 2 

SSTAGE 

Secondary stage of development 


STRSWT 

Stem reserves 

g/plant 

STWT 

Stem weight 

g/plant 

TNUM 

Tiller number 

number/plant 

TSDAT 

Terminal spikelet date 


TSS 

Duration of saturation 

days 

TTD 

Thermal time over last 20 days 

°C 

TTNUM 

Thermal time means in sum 

°C days 

VF 

Vernalization factor 


WFG 

Water stress factor for growth 


WFGC 

Cumulative growth water factor 


WFLFNUM 

Water stress factor for each leaf 


WFLFSUM 

Cumulative water stress factor per leaf 


XSTAGE 

Stage of development 


ZSTAGE 

Zadok stage of development 


ZSTAGEP 

Precious Zadok stage 
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[ 14 ] The ensemble of model state predictions at time t is 
stored in X t = [xyi, x t , 2 , ■ ■ ■ , Xt,N ens }, which has size 
D stt x N ens where D stt is the dimension of x t . Similarly, the 
ensemble of model outputs is Y t = \y t .\ , y tj 2 , . . . , jyy.J, 
which has dimensions D obs x N ens where D obs is the dimen- 
sion of the observation vector, z t . The observation error co- 
variance, R h is required a priori and an observation sample 
Z t = [C M , Cr 2 ) • • • ) C( is generated according to 

Ct,i ~ N[z t , R t ]. (2) 

The ensemble of EnKF updated model states, X t , is calcu- 
lated as a least squares estimate based on model predictions 
and observations resulting in 

Xt = X t + Pt(Qt + R t ) _1 (Z t — Y t ) (3a) 

where 

(X t — E[X t ])(Y t — E[Y t ]) r 

‘ (Nens 1) 

is the cross covariance between ensemble deviations from 
mean model state and deviations from mean output and 


(root accumulation is stored as a volumetric fraction rather 
than as a mass) as well as canopy height. Our EnKF 
employed a threshold filter which discarded any relation- 
ship between model states and modeled observation com- 
ponents with a Pearson product-moment correlation 
coefficient |p| < 0.3. This reduced the possibility of pick- 
ing up spurious correlations. 

[ 16 ] It is important to note that the CC is a set of step 
functions which calculate crop attributes in fundamentally 
different ways depending on the current stage of develop- 
ment. Because of these and other nonlinearities, the EnKF 
will not guarantee mutually consistent model states after 
each update since it uses a single correlation relationship to 
update every ensemble member regardless of the current 
growth stage of each particular simulation. 

2.3. The Sequential Importance Resampling Filter 

[ 17 ] The SIRF provides an approximate Bayesian esti- 
mate of model state uncertainty at each time step condi- 
tional on past observations without assumptions of linearity 
and second-order statistics. At each observation time, each 
ensemble member’s state vector is assigned an importance 
weight, W", which is proportional to the posterior likeli- 
hood of that state vector conditional on all past observa- 
tions : 


(Y t -E[Y t ])(Y,-E[Y t ]) r 
(Nens 1) 


(3c) 


W” oc P( 


X, |z 1:t ) = - 


P(z t | x”,zi :t _i)P(x"|zi :t _i) 

P(z.) 


(4a) 


is the covariance matrix of ensemble deviations from mean 
model output. Both P t and Q t are sampled directly from 
the ensemble. 

[ 15 ] The finite nature of ensemble representations of 
uncertainty can lead to spurious updates when X t contains 
components that are not approximately or locally linearly 
related to components of Y t . An analysis of DSSAT state 
and observation correlations resulted in a list of important 
Markov state components which have local approximately 
linear relationships with one or more output (Table 3, 
except stage timing states). This list includes all CC plant 
mass storage components, plant leaf area, and root volume 


Table 3. The State Vector Updated by the EnKF and the State 
Vector Perturbed by Equation (5a) to Simulate Model Structural 
Error 3 


State Component 

Units 

Dimensions 

CSM states 

Soil water 

m 3 m - 3 

9 

Canopy height 

m 

1 

CC states 

Root volume fraction 

cm 2 cm -3 

9 

Chaff weight 

g/plant 

1 

Stem weight 

g/plant 

1 

Leaf weight 

g/plant 

1 

Reserves weight 

g/plant 

1 

Grain weight 

g/plant 

1 

Plant leaf area 

cm 2 

1 

Seed reserves 

g/plant 

1 

Stage timing states 

Cumulative development units 

°C days 

1 

Cumulative germination units 

°C days 

1 


a Cumulative development units and cumulative germination units are 
not updated by the ensemble Kalman filter (EnKF), and grain weight is not 
perturbed by equation ( 5 a). 


superscripts index the ensemble member. The observations 
are assumed to be independent conditional on the model 
output, and model output is a deterministic function of 
model state according to (lb) so that the likelihood function 
relates observations to model state vectors according to 

P( z t|x",z 1:t _ 1 ) = P(z t |y"). (4b) 

P(z t |y") is emulated by the observation uncertainty pdf, in 
this case Gaussian with mean y" and covariance R t . The 
state prior, P(xJ*|z 1:t _ 1 ), is estimated discretely by X t by the 
ensemble in the same way as the EnKF ; this is achieved 
for time step t + 1 by resampling the ensemble at time step 
t with replacement and with probabilities proportional to 
{W”} resulting in an iid discrete representation of the pos- 
terior, P(x“|z 1:t ). X t+1 , as calculated from X t by equation 
(la), thus contains an iid discrete representation of the prior 
at time step t + 1, P(x" , | z ( :t ) . Proportional probability 
weights are simply calculated as 

K = P(zt|y t ”) = exp (- l - (z t - y”) T Rf 1 ( z t - y, n ) j • (4c) 

In our case, the simulation ensemble was updated by 
replacing all 97 components of each member’s state vector 
(Tables 1 and 2) with a state vector from a different simula- 
tion (sampling with replacement). Each model retained its 
own parameters and forcing data. 

2.4. Observing System Simulation Experiments 

[is] OSSEs, as diagrammed in Figure 1, were used to 
assess LAI and 9 assimilation potential. A group of N ens + 1 
simulations (ensemble size is discussed in section 2.4.3) was 
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Figure 1. Observing system simulation experiment process diagram: Transparent gray boxes represent 
sequential importance resampling filter (SIRF) and ensemble Kalman Jilter (EnKF) assimilation algo- 
rithms, u are forcing data, 9 are model parameters, X are model states, X are filter-updated model states, 
Y are modeled leaf area index (LAI) and 0, Z are observed LAI and 9, and VD is yield. R, £$, and £ u are 
uncertainty variances listed in Tables 4 and 5 and equation (5a). 


sampled from a modeling uncertainty pdf. One of these 
simulations was chosen randomly as the truth system 
(upper path in Figure 1) leaving an A^-member predic- 
tion ensemble which was used to estimate yield with and 
without data assimilation. Synthetic observations were 
generated by sampling from an observation uncertainty 
distribution around the truth system output and assimilated 
by the EnKF and SIRF (middle paths in Figure 1). The en- 
semble of model simulations without data assimilation 
was called the open loop (bottom path in Figure 1). The 
open loop, EnKF and SIRF all used the same truth system 
and ensemble members (parameters, initial conditions and 
weather forcing data) and the EnKF and SIRF used the 
same synthetic observations. 

[ 19 ] This type of OSSE was used to (1) choose an appro- 
priate ensemble size, (2) test the effects of EnKF and SIRF 
assimilation on segregated and combined modeling uncer- 
tainty sources, (3) test the effects of observation uncertainty 
on EnKF and SIRF assimilation. Experiments 2 and 3 used 
the ensemble size chosen by experiment 1. In all cases 
except for when determining ensemble size, each OSSE was 
repeated fifty times by drawing separate truth systems and 
ensembles; this Monte Carlo repetition provided a statisti- 
cally independent experiment sample. Sections 2.4. 1-2. 4. 5 
describe the modeling uncertainty pdf, the procedure for 
generating synthetic observations from truth system output, 
and these sets of experiments. 

2.4.1. Modeling Uncertainty Distributions 

[ 20 ] Assimilation was tested on two rain-fed wheat crops 
with different levels of water stress. Mean parameter and 
weather inputs came from field experiment data which are 
packaged with the DSSAT version 4.5 release: a 1975 


study of a summer wheat crop conducted in Swift Current, 
Saskatchewan, Canada, reported by Campbell et al. [1977a, 
1997b] and a 1974-1975 study on winter wheat conducted 
in Rothamsted, UK. Figure 2 plots LAI, grain weight, and 
water stress for both crops simulated using mean parame- 
ters outlined in Table 4 and unperturbed weather forcing 
data. 

[ 21 ] The Swift Current summer wheat crop represents a 
water-limited system and yielded 104 kg ha" 1 using mean 
parameters listed in Table 4; the potential, nonstressed 
yield was 4266 kg ha -1 . The mean parameter and forcing 
system received 153.6 mm of rainfall over a total of 95 
days from planting on 25 May 1975 to maturity on 28 Au- 
gust 1975 and had a total evapotranspiration of 151.9 mm. 
The water stress factor reached a high of S w = 0.97 during 
the ear growth stage which occurred between day 51 and 
day 61 after planting. Although this crop produced very lit- 
tle yield using the mean parameter and input values, it was 
often the case that simulations sampled from the parameter 
and input uncertainty distribution caused a substantial 
increase in yield. 

[ 22 ] The Rothamsted winter wheat crop represents an 
energy-limited system and reached potential yield of 665 1 
kg ha 1 using mean parameters listed in Table 4. The sys- 
tem received 512.9 mm of rainfall over a total of 269 days 
from planting on 6 November 1974 to maturity on 6 August 
1975 and had a total evapotranspiration of 381.2 mm. The 
water stress factor reached a high of S w = 0.84 during the 
grain filling stage which occurred between day 240 and 
harvest; the water stress factor was close to 0 during all 
other development stages. Although this crop produced 
potential yield using the mean parameter and input values, 
it was often the case that simulations sampled from the 
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Figure 2. Baseline simulations of water-limited (Swift Current) summer wheat and energy-limited 
(Rothamsted) winter wheat with parameters listed in Table 4. In the Rothamsted plot, water stress is 
magnified by a factor of 8. 


parameter and input uncertainty distribution caused a sub- 
stantial decrease in yield. 

[ 23 ] Weather forcing data uncertainty was emulated by 
perturbing daily measured weather data with values 
sampled form the temporally and cross-correlated joint pdf 
outlined in Table 5. Perturbations on solar radiation and 
precipitation were multiplicative lognormally distributed 
with a mean of 1 and standard deviations of 0.3 and 0.5, 
respectively; perturbations on temperature were additive 
Gaussian with a mean of 0 and unit variance, and the same 
daily perturbation was applied to daily maximum and mini- 
mum temperature. Weather perturbations were cross corre- 
lated and AR(1) (first-order autoregressive) temporally 
autocorrelated with correlation coefficients 1/e following 
Reichle et al. [2007, 2010] ; since integration was on a daily 
time step, these autoregression coefficients are relevant to a 
daily time series. 

[ 24 ] Parameter uncertainty distributions were Gaussian 
(approximately, due to bounds) with means, variances, and 
bounds listed in Table 4. Cultivar parameters are model 
specific; parameter files included with DSSAT release ver- 
sion 4.5 provided the limits and variances listed in Table 4. 
Variances and bounds for surface soil parameters were also 
estimated using a library of soil parametrizations included 
with DSSAT version 4.5. We used lumped parameters for 
the bottom eight soil layers because of a presumed lack of 


knowledge about subsurface soil properties. The root 
growth factor was assumed to decrease exponentially with 
depth and was parametrized by a maximum value at the 
surface. Porosity, saturated conductivity and residual satu- 
ration were calculated from clay and silt percentages using 
pedotransfer functions from Cosby et al. [1984], Bulk den- 
sity was calculated as a function of porosity assuming min- 
eral density of 2.65 g cm -3 and drained upper limit was 
taken as the average of saturated and residual moisture con- 
tents. Global soils maps that provide clay and silt contents 
are not usually associated with useful error estimates 
because much of the error in soil mapping is due to sparse 
measurements of heterogeneous areas. Here we sampled 
sand and clay percentage parameters independently, each 
with a standard deviation of 10%; sand and clay percen- 
tages were constrained to be positive and the sum was con- 
strained to be less than one by, when necessary, reducing 
both parameters by an equal amount. Given the mean sand 
and clay parameters from Table 4, this resulted in 95% con- 
fidence bounds which spanned approximately 18% of the 
soil textural triangle at and Swift Current and 25% of the 
soil textural triangle at Rothamsted. 

[ 25 ] Model structural uncertainty was simulated by add- 
ing noise to the model state transition equations : 

x, = M x (tl,- 1 , U(, •&) + t t \ e, ~ N[0, £,] (5a) 
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Table 4. The (Approximately) Gaussian Probability Density Function of Uncertainty in Model Parameters and Initial Conditions 

Uncertainty 


Mean Parameters Values 


Uncertainty Source 

Swift Current 

Rothamsted 

Standard Deviation 

Lower Bound 

Upper Bound 

Units 

Vernalizing duration 

0 

60 

Cultivar Parameters 
3 

0 

60 

days 

Photoperiod response 

60 

67 

10 

0 

200 

% 

Grain filling duration 

335 

515 

33.5 

100 

1000 

°C days 

Kernel number 

25 

14 

2.5 

10 

50 

number/g 

Standard kernel size 

26 

44 

2.6 

10 

80 

mg 

Standard tiller weight 

1.5 

4.0 

0.3 

0.5 

8 

g 

Interval between leaves 

86 

100 

10 

30 

150 

°C days 

Albedo 

0.10 

0.14 

Soil Parameters 
0.05 

0 

1 


Upper limit evaporation 

9.4 

6.0 

2.0 

1 

12 

cm d _l 

Drainage rate parameter 

0.20 

0.50 

0.3 

0.01 

0.99 

1/d 

Runoff curve number 

84.0 

60.0 

10 

1 

99 


Root growth factor 
Layer 1 (0-5 cm) 

1.00 

1.00 

0.05 

0 

1 


Layers 2-9 (5-1 80 cm) 

0.74 

0.90 

0.1 

0 

1 


Percent clay 
Layer 1 (0-5 cm) 

10.7 

23.4 

10 

0 

100 

% 

Layer 2-9 (5-1 80 cm) 

9.2 

23.4 

10 

0 

100 

% 

Percent silt 

Layer 1 (0-5 cm) 

29.9 

30.0 

10 

0 

100 

% 

Layers 2-9 (5-1 80 cm) 

29.7 

30.0 

10 

0 

100 

% 

Initial soil moisture 
Layer 1 (0-5 cm) 

0.23 

0.33 

0.04 

Lower limit 

Saturation 

m 3 m- 3 

Layers 2-9 (5-1 80 cm) 

0.20 

0.33 

0.08 

Lower limit 

Saturation 

m 3 m~ 3 


This type of model error was added to those states listed in 
Table 3 except for grain weight, and in some cases, where 
specified, to the development unit stage timing states cumu- 
lative development units and cumulative germination units. 
It was found that perturbing the grain weight state caused 
irreconcilable yield error by weakening statistical relation- 
ships between observations and yield. Random state pertur- 
bations were drawn from zero-mean Gaussian distributions 
with heteroscedastic variances 

E I = I Ds „0.02(x t _ 1 ), (5b) 

where Id s „ is the D stt dimension identity matrix. This 
ensured that no state would become finite purely because of 
the perturbation, and threshold filters were used to ensure 
that all state values remained nonnegative. Perturbations 
were sampled independently across states and time steps. 

2.4.2. Generating Synthetic Observations 

[ 26 ] The truth system output vector (yf u>h = 

LA I tm 'h } ) was used to generate synthetic observations, z t . 


At every observation time, a remote sensing measurement 
process was simulated by drawing from the Gaussian 
distribution 


zt ~ N[y t , R,\: R, 


' R ? 
o 


0 

R L.4! 


(6a) 


R^ AI and R e t represent the observation error covariance 
matrixes related to LAI and 9 observations, respectively. 
Synthetic observations have no spatial scale. 

[ 27 ] Frequency and error properties of synthetic observa- 
tions were guided by uncertainty in existing remote sensing 
data. Observations of 9\ are available from SMOS at most 
major agricultural areas every 3 days with a spatial resolu- 
tion of 50 km and approximate retrieval accuracy of error 
~ ±0.04 m 3 m~ 3 [Kerr et al., 2010]. An improvement in 
spatial resolution (to 9 km) is expected with the launch of 
SMAP in 2014 [Entekhabi et al., 2010], Measurement 
accuracy will degrade as vegetation water content, 9 veg 
(kg rrf ), increases throughout the growing season; in the 


Table 5. Forcing Data Perturbation Sampling Parameters 


Weather Inputs 

Multiplicative 
or Additive 

Standard 

Deviation 

AR(1) 

Coeffienf 


Correlations 


Data 

Units' 3 

Temperature 

Radiation 

Precipitation 

Temperature (maximum and minimum) 

A 

i 

1/e 

i 

-0.80 

-0.32 

°C 

Solar Radiation 

M 

0.3 

1/e 

-0.80 

1 

0.40 

MJ nT 2 d 

Precipitation 

M 

0.5 

1/e 

-0.32 

0.40 

1 

mm 

a First-order autoregressive coefficients 

assume a daily time 

series. 







b Data units are the dimensions of the forcing data itself and not the units of the perturbations except in the case of temperature which uses additive per- 
turbations; radiation and precipitation perturbations are multiplicative and unitless. 
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case of SMAP, observations at or better than the ±0.04 
m 3 m 3 accuracy level are expected to be confined to areas 
with vegetation water content less than 5 kg m -2 [Ente- 
khabi et al., 2010]. Jackson and Schmugge [1991] devel- 
oped a relationship between vegetation water content and 
vegetation transmissivity, as proposed by Kirdiashev et al. 
[1979], which Bolten et al. [2010] adapted to model soil 
moisture observation uncertainty as 

^ =J?oeXp (2^fe)' (6b) 

R 0 (m 6 /m 6 ) is the variance of the soil moisture retrieval at 
time t made over vegetation, R 0 (m 6 /m 6 ) is the variance of 
estimates made over bare soil, b is an environment parame- 
ter which accounts for vegetation type and roughness, and 
<t> (degrees) is the satellite incidence angle. The SMAP inci- 
dence angle is 40° and we adopted the value of b = 0.12 
for agricultural crops [Crow et al., 2005], along with a bare 
soil observation error standard deviation of ( R 0 ) 1//2 = 
0.027 m 3 m ~ 3 resulting in a retrieval accuracy model which 
approaches (R 0 ) 1 ^ 2 = 0.04 m 3 m ~ 3 as vegetation water 
content approaches 9 veg = 5 kg m -2 . Vegetation water con- 
tent was assumed to be a constant fraction of plant biomass 
and plant population (p pop (plants/m 2 )) : 

«.-(£?> «*> 

a = 0.75 was set according to results reported by Malhotra 
[1933], Plant population is a CC component that is not 
included in the EnKF update. 

[ 28 ] Synthetic observations of 9\.$ were generated every 
3 days, each layer perturbed independently with the same 
statistical error characteristics. Remote sensing platforms 
are only able to measure soil water content in the upper 
few centimetres of soil, and in section 3.2 we compare 
assimilations which used only 9\ observations with assimi- 
lations which used observations of 9 1 . 9 . 

[ 29 ] The MODIS MOD15A2 product group provides LAI 
(m 2 nT 2 ) estimates as an 8 day composite. The accuracy 
and uncertainty in this product has been investigated over 
agricultural areas by comparing the composite image to 
reference LAI from a single day at the end of the composite 
period; the uncertainty standard deviation was reported to be 
(7 ^') 1/2 < 0.38 m 2 nr 2 [Tan et al, 2005], We generated 
synthetic LAI observations with [R\ ai ) 1 ^ 2 = 0.38 every 
8 days to simulate the remote sensing measurement process. 

2.4.3. Ensemble Size Experiments 

[ 30 ] Ensemble size represents a balance between pdf rep- 
resentation and computation expense. Effects of varying 
ensemble size were evaluated using ensembles of N ens = 
10, 25, 50, 75, 100, 250, 500, 750, and 1000. An appropri- 
ate ensemble size was found when pdf predictions of grain 
weight, LAI and <)\ became stable with increasing sample 
size. The quality of ensemble representations of outputs 
LAI and 9\ and state grain weight was quantified using the 
root mean square error (RMSE) taken over all simulation 
time steps of the difference between mean ensemble pre- 
dicted output values and the true value. 


[ 31 ] To make direct comparisons between OSSEs with 
different ensemble sizes, it was necessary to use the same 
truth system and observation set. Four truth systems were 
chosen and, for each truth system, a corresponding set of 
observations were generated and an N em = 1 , 000 member 
ensemble was sampled from the full uncertainty pdf out- 
lined in Tables 4 and 5 and equation (5a). For each truth 
system, each ensemble of increasing size completely con- 
tained all smaller ensembles. For example, for a given truth 
system, the N ens = 25 member ensemble contained the 
N em = 10 member ensemble plus 15 additional simula- 
tions. The RMSE averaged over these four experiments is 
reported, and the particular choice of ensemble size is dis- 
cussed in section 3.1. 

2.4.4. Modeling Uncertainty Experiments 

[ 32 ] Once an appropriate ensemble size was chosen, 
experiments were conducted to test the ability of data 
assimilation to mitigate particular types of modeling uncer- 
tainty in yield estimates. Modleing uncertainty pdf were 
taken as marginal distributions of the entire joint uncer- 
tainty distribution (Tables 4 and 5 and equation (5a)). We 
tested the full joint uncertainty pdf described in section 
2.4.1 plus five marginal uncertainty pdf related to (1) 
weather forcing data (Table 5), (2) soil parameters and ini- 
tial conditions (Table 4), (3) cultivar parameters (Table 4), 
(4) model state perturbations (equation (5a)) to states listed 
in Table 3 except for grain weight, cumulative development 
units and cumulative germination units, and (5) same as 
point 4 but with perturbations to cumulative development 
units and cumulative germination units. Because each uncer- 
tainty type was independent from all others, marginalizing a 
particular uncertainty component was done by setting all of 
the variances of its uncertainty components to zero. 

[ 33 ] Assimilation OSSEs were run for simulations of 
water-limited (Swift Current) and energy-limited (Roth- 
amsted) crops using four types of observation sets: LAI 
and 9 1 , LAI only, 9\ only, and 9i^\ the first three represent 
what are available from satellites and the fourth provides a 
way to assess limitations of only having surface level soil 
moisture information. Each OSSE was repeated fifty times 
with different truth systems, ensembles and observations. 
The results were evaluated using a mean error score (ME 
(kg ha -1 )), which is the absolute difference between the en- 
semble mean predicted end-of-season yield and the true 
yield. This was calculated for the open loop, EnKF, and 
SIRF ensembles for each of 50 OSSEs, and a single-tailed, 
two-sample t test ( a = 0.05) was used to test for a signifi- 
cant reduction in mean ME score due to SIRF or EnKF 
assimilation. 

[ 34 ] Yield estimates can only be expected to improve 
when there are strong (although possibly indirect) relation- 
ships between model outputs and grain weight. Since grain 
weight is related to LAI and water stress through biomass, 
we report the absolute time-averaged Pearson product- 
moment correlation coefficients (|p|) between model out- 
puts and (3. For 0\ $ the sum of profile water was used. Sta- 
tistics were calculated using all open loop ensemble 
members from the 50 combined uncertainty OSSEs. 

2.4.5. Observation Uncertainty Experiments 

[ 35 ] Eight sets of OSSEs which utilized the full model- 
ing uncertainty pdf were used to test effects of observation 
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uncertainty on assimilation results. Synthetic observations 
of LAI and 9 \ were generated and assimilated every 3 and 
8 days as described in section 2.4.2; however, 9 error var- 
iances were (R e ) (l/2] = 0.001, 0.005, 0.010, 0.015, 0.020, 
0.030, 0.040, and 0.050 (m 3 nr 3 ) and LAI error variances 
were ( R LAI = 0.01, 0.02, 0.05, 0.10, 0.15, 0.20, 0.30, 
and 0.40 for the eight sets of OSSEs. For both water-lim- 
ited and energy-limited crops each observation type and 
uncertainty level was tested using fifty Monte Carlo OSSEs 
and the reduction in ME scores was used to evaluate EnKE 
and SIRF performance. 

3. Results 

3.1. Ensemble Size Experiments 

[36] Figure 3 illustrates RMSE output and grain weight 
dynamics due to varying ensemble size averaged over four 
sets of Swift Current OSSEs. Generally, when LAI was 
assimilated LAI RMSE improved but not 9\ RMSE, and 
when 9\ was assimilated 9\ RMSE improved but not LAI 
RMSE; this is similar to the findings of Pauwels et al. 
[2007], who used the EnKF and observed little improvement 
to modeled LAI when 9\ observations were assimilated. In 
both cases where outputs improved (LAI RMSE for LAI 
assimilation and 9\ RMSE for 9\ assimilation) output 
RMSE values were relatively stable with increasing ensem- 
ble size after about N ens = 25 for both the EnKF and SIRF. 


In the two cases where outputs did not improve (6\ RMSE 
for LAI assimilation and LAI RMSE for 9 1 assimilation) 
output RMSE values stabilized around N ens = 100. de Wit 
and van Diepen [2007] used the EnKF and found stability 
in mean-square error of modeled LAI with EnKF assimila- 
tion of 9\ at around N ens = 100, which is in agreement with 
our findings. The grain weight state RMSE generally 
improved when LAI observations were assimilated but not 
9\ observations; in both cases, the grain weight RMSE 
became relatively stable around N ens = 100. The remainder 
of the OSSEs discussed in this paper used an ensemble size 
of N ens =100. 

3.2. Modeling Uncertainty Experiments 

[ 37 ] Table 6 reports Monte Carlo average ME scores and 
time-averaged correlations between outputs and end-of- 
season yield for water-limited and energy-limited OSSEs. 
The results for each uncertainty type are described below. 
It is important to understand that the time-averaged correla- 
tions between model outputs and biomass inform assimila- 
tion results, but that the relationship is indirect and 
situation dependent. Thus the \p\ coefficients in Table 6 
cannot be compared directly. 

3.2.1. Weather Inputs 

[38] Precipitation uncertainty affects grain development 
indirectly via the water stress control on the leaf weight 
component of biomass, while radiation and temperature 


LAI Assimilation 8 i Assimilation 








Figure 3. Output and grain weight time-averaged open loop, EnKF, and SIRF root-mean-square error 
values as a function of increasing ensemble size. 


9 of 13 


W05525 


NEARING ET AL. : ENKF AND SIRF OSSES FOR WHEAT YIELD 


W05525 


w 

CZ5 

C /3 

O 

.c 

S3 

t5 

<L> 

O 

c 

OX) 

.g 

-o 

o 


o 

U 

C3 

O 


o 

U 

-o 


-o 


W 


<! 


u 


pfi 

03 

H 


Ph 

2 


d- o 
O hJ 


U 


<N m - O 
cn no oo 
ir> <n r-; 
o o o o 


o oo no m 

on oo oo 
-t O (N| H 

o o o o 


OO OO 't VN 
'si- -H VO Tt 

in on on oo 
o o o © 


fh cn vq vq 
ri o\ i- - h 


r- r- o tj- 


y—< cn on i/i 
on in on 
^ r- o r- 
<N cn cn Tt 


H o O; N 
O^ 't h 

— <n — r- 
CN -xl- Tt 


q ^ o t 
on no no no 
on oo on 
<n no m <n 


O On on rq 
6 h fi iri 
oo Os t-~ *n 
fN fN ro n 


<N on rq © 
•n — ‘ Tf 

m t— i vo oo 
cn co m Tt 


CN fO CN © 

o F 'd o 
oo oo vo 
cn cn cn Tt 


r- vq in oo 
cn cn in tj- 
'xt- cn r- o 
cn cn m vo 


T3 00 

C- C 3 

3 2 


c 0 


§ 

.a 


•- '$ 
15. 

§ a s 

M « .2 

o c (5 

■p B n 

« Ti 

* 1 P'1 
I i.g & 

ea 27-4= £ 

'O = £ 
? t/1 U M 


c- 

-o 2 
<u c 

e <-> 

~ <K 

<D ^ 
cS C/3, 


on vo in 

in cn in ^ in o 

q o o o 

© © © © © © 


(N vo cn co o o 
HHTtoqo 
© © o o © o 


h 00 OO On O -H 

in o o h ^ 

vq in oo oo cq 

cd cd cd cd cd cd 


O CO 

q cn 
cd cd 


CO CN Tf 1 
>n cn Tf 
vo r- cn 


r- vo cn 

CO t CN 1 
VO t CO 


— in co . ( 
cd CN in °) 2 i 
co in — o > 
n vo co ij ( 


: n- 

: a 


ON oo o 
OO CO CO 1 
in h co 


~a 

c 

c 


£ M 

S CL) 
.2 ° 

ea 5 
Q o 
Id ca 
£ -o 
X-> o 
bp o C 

.5 

s s s 

'V S O 

55 o 


cn n 
in 

q vq 
cd cd 


CO o 
00 t'-' 
CO 00 


b/j 

c3 


O 

o 

^3 


■o M ? 
is *-< w 

§S g 
» 2 -2 
5 S ca 
S >5 •fi 

i M 

3 2 £ a> 

= M 5 & 

1 n, >> • 

3 j7'42 3 

; o 3 3 • 

- OO O 00 


T 3 

-2 £ 
‘P C« 

•S E 

ca 

>%•£ 
g? o 

s & 

C ^ 

w 


< 

hJ 


bp 

bO 


od 

00 


CN CN 
in cn 


vq vq 
'd cd 


in 


oo 

O 


ON OO 

N" 

OO CN 
O CN 


N" CN 
CO 

O CO 


’p 


affect grain development directly as well as indirectly 
though biomass. In simulations of the water-limited crop, 
LAI and 9 were correlated with /3 at approximately 
\p\ = 0.5. ME scores were not significantly improved by 
assimilating LAI or 6\ using either filter. The inefficiency 
in assimilating LAI observations can be attributed to differ- 
ences in the way radiation and temperature affect leaf and 
grain development and the fact that uncertainty in radiation 
and temperature affected grain growth after leaves were 
senesced (Figure 2). 

[ 39 ] The SIRF was able to reduce the average ME score 
by assimilating 0 1:9 and was more successful than the 
EnKF, in this case likely because of the EnKF’s linear 
model assumption. Swapping out state vectors for ones 
from ensemble members which represent crops grown in 
conditions similar to the truth system (similar historical 
water demand and availability) improved yield predictions ; 
however, updating plant states on the basis of linear corre- 
lations with soil states was inefficient, and improvements to 
estimates of profile water content did not translate into 
improved simulations of future plant development. The 
useful information stored in the soil moisture state was in- 
formation about growth histories; this suggests that an 
improved understanding of the effects of weather on the 
crop growth environment might not be as important as an 
improved understanding of the effects of weather on crops 
themselves. 

[ 40 ] In energy-limited simulations, LAI was correlated 
with /3 at \p\ = 0.8 and EnKF assimilation of LAI observa- 
tions improved yield predictions. Water stress was a decor- 
relating factor between LAI and biomass because of 
differences in how leaf weight, stem weight, and reserves 
weight responded to stress. Again, in the energy-limited 
environment, SIRF assimilation of 9 1 :9 improved yield esti- 
mates while the EnKF did not. 

3.2.2. Soil Parameters and Initial Conditions 

[ 41 ] In simulations of the water-limited crop, the ability 
of the soil to infiltrate and store water was important for 
productivity. Correlations between LAI and fl were high 
(|p| = 0.918); however, assimilation of LAI did not result 
in improved yield estimates. In this case, when the filters 
increased biomass because of high LAI observations, large 
plants were essentially placed into soils that could support 
them. When the filters decreased biomass because of low 
LAI observations, the plants grew quickly because of suffi- 
cient water availability. This is an example of the limita- 
tions of data assimilation filters when model parameters are 
uncertain. Since LAI observations were not available after 
senescence (Figure 2), the updated plant simulations were 
able to respond to the new environment before grain 
growth was completed. 

[ 42 ] In contrast, correlation between 0| :9 and (3 was mod- 
erate (|p| = 0.233); however, the SIRF was able to improve 
ME scores by assimilating 9 1 :9 . In this case, since observa- 
tions of soil moisture were available through the end of the 
growing season, the SIRF was able to replace the ensemble 
of plant state vectors late in the grain development phase 
with ensemble members which had developed in conditions 
similar to the truth system. 

[ 43 ] Surface level soil moisture observations contained 
insufficient information to improve grain weight simulations 


10 of 13 



W05525 


NEARING ET AL. : ENKF AND SIRF OSSES FOR WHEAT YIELD 


W05525 


because surface soil layer parameters were independent of 
root zone parameters and the root zone largely controls 
water availability. In the low-stress simulations, soil param- 
eters did not affect grain growth. 

3.2.3. Cultivar Parameters 

[ 44 ] Error in yield estimates due to uncertainty in culti- 
var parameters was not mitigated by state updating for ei- 
ther crop using any combination of filter and observations. 
Correlations between LAI and / 3 were high, but differences 
in the kernel number and standard kernel size parameters 
decoupled (3 from grain weight. LAI and 0 observations do 
not inform these parameter values. 

3.2.4. State Perturbations Without Stage 
Timing States 

[ 45 ] When states other than stage timing variables cumu- 
lative development units and cumulative germination units 
were perturbed, water-limited LAI was correlated with (3 
(|p| = 0.845), however in the energy-limited environment, 
it was not (|p| = 0.341). In the water-limited environment, 
perturbations to the root zone soil water state partially con- 
trolled growth variations through the stress factor, and this 
resulted in correlated LAI and (3. In the energy-limited 
case, LAI and (3 were not well correlated because the state 
perturbations were independent. Thus in the water-limited 
environment, LAI assimilation resulted in improved ME 
score, and in the energy-limited environment, it did not. 

[ 46 ] Note that 9\ was not well correlated with (3 in either 
set of simulations; however, 0 1:9 was correlated in the 
water-limited environment. Again, this is because perturba- 
tions to soil water states were independent between layers 
and root zone water availability determined water stress. 
0 h 9 assimilation improved yield estimates in this case. 

3.2.5. State Perturbations Including Stage Timing 
States 

[ 47 ] When perturbations to stage timing states cumula- 
tive development units and cumulative germination units 
were included, the water-limited LAI -(3 correlation was 


reduced (|p| = 0.845 to \p\ = 0.657). Differences in growth 
stages between ensemble simulations accumulated through- 
out the growing season and caused a gradual decrease in 
cross correlation between vegetation states (not shown). In 
energy-limited simulations, the correlation increased 
slightly (|/j| = 0.341 to \~p\ = 0.452) and the SIRF was able 
to improve ME scores by assimilating LAI. In the water- 
limited case, stress controlled biomass and LAI through 
leaf development and dissimilar development stage transi- 
tions resulted in decorrelation. In the energy-limited case, 
random perturbations controlled leaf development, stem 
weight, and reserves weight independently and similar de- 
velopment stage transitions resulted in slightly higher cor- 
related vegetation states. 

3.2.6. Combined Uncertainty Sources 

[ 48 ] In a full modeling uncertainty paradigm, the EnKF 
and SIRF were unable to significantly improve ME scores 
in any case. This is a combined result of lack in state corre- 
lations caused by differences in cultivar parameters and 
differences in weather controls on biomass, LAI and grain 
weight. 

3.3. Observation Uncertainty Experiments 

[ 49 ] Results from varying error variances of synthetic 
LAI and 9\ observations (Table 7) suggest that even nearly 
perfect observations of surface level soil moisture will not 
improve single season yield estimates under reasonable 
modeling uncertainty assumptions. LAI assimilation was 
valuable in the water-limited simulations when the LAI 
observations error standard deviation was between 0.05 
and 0.30 [nr nr 2 ]. The SIRF was almost always better at 
assimilating LAI in the water-limited simulations, which is 
likely due to the highly nonlinear nature of the CC module. 

4. Discussion 

[ 50 ] The purpose of this study was to identify potential 
for state-updating data assimilation to mitigate error in yield 
estimates due to modeling uncertainty. Results show that 


Table 7. Monte Carlo Average Mean Error Scores for Combined Modeling Uncertainty Assimilations With Increasing Observation 
Error Variance (Observation Error) a 


Crop System 

Open Loop 
Mean Error 


LAI 



0i 


Observation 

Uncertainty 

Mean Error 

Observation 

Uncertainty 

Mean Error 

EnKF 

SIRF 

EnKF 

SIRF 

Water limited (Swift Current) 

718.7 

0.01 

607.6 

656.9 

0.001 

758.9 

1054.6 


731.9 

0.02 

666.8 

682.6 

0.005 

786.9 

949.1 


701.5 

0.05 

609.8 

543.8 

0.010 

750.5 

760.6 


665.0 

0.10 

597.6 

586.0 

0.015 

671.2 

664.9 


705.8 

0.15 

565.5 

494.5 

0.020 

681.7 

771.1 


681.1 

0.20 

551.7 

524.2 

0.030 

707.7 

745.1 


752.4 

0.30 

631.2 

609.0 

0.040 

732.9 

760.4 


666.7 

0.40 

559.5 

572.2 

0.050 

670.1 

658.9 

Energy limited (Rothamsted) 

1314.1 

0.01 

1394.0 

1528.6 

0.001 

1300.3 

2111.5 


1296.0 

0.02 

1321.9 

1232.0 

0.005 

1304.0 

1922.3 


1360.6 

0.05 

1377.2 

1341.2 

0.010 

1351.9 

1865.5 


1572.4 

0.10 

1626.2 

1435.2 

0.015 

1594.3 

1776.3 


1744.9 

0.15 

1785.2 

1278.0 

0.020 

1727.6 

2074.9 


1383.5 

0.20 

1397.3 

1342.1 

0.030 

1413.2 

1513.7 


1454.7 

0.30 

1345.2 

1352.5 

0.040 

1438.6 

1353.1 


1472.9 

0.40 

1460.0 

1330.3 

0.050 

1419.9 

1731.0 


a Bold values indicate a significant reduction in mean error score by assimilation (a = 0.05). 
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this approach was generally unable to improve yield esti- 
mates under realistic uncertainty scenarios. There were 
many factors which contribute to this: (1) weather inputs 
affect grain and leaf growth differently meaning that similar 
LAI outputs do not imply similar grain weight states, (2) that 
certain cultivar parameters affect grain development directly 
in a way which is independent of all other model states, (3) 
that state updating often results in plant state vectors which 
disagree with model (soil) parameters, and (4) that surface 
level soil moisture observations did not contain sufficient in- 
formation about available water and water stress. Results 
suggest that in water-limited environments, LAI assimilation 
would be more useful if observation error was lower than 
what is currently available. This is a problem because LAI 
observations will suffer from uncertainties which were not 
considered in this study, namely, spatial heterogeneity in ag- 
ricultural systems and discrepancies in spatial resolutions 
between fields and image pixels. 

[ 51 ] These findings are qualitatively similar to those 
from the case study reported by de Wit and van Diepen 
[2007], who used the EnKF to assimilate 8\ observations 
over crop land in Europe. They found improvement to win- 
ter wheat yield estimates in 33 out of 88 test regions. 
Although their real-world experiment was likely hampered 
by mismatches in spatial resolution between agricultural 
fields and remote sensing observations, as well as other 
spatial factors including crop heterogeneity, we have 
shown that these factors alone were likely not the reason 
for poor assimilation results. 

[ 52 ] This study suggests that in order to combine remote 
sensing observations with agricultural models for the pur- 
pose of estimating yield at single-season time scales, it will 
be necessary to modify our interpretation of crop develop- 
ment. Primarily, it would be interesting to investigate meth- 
ods and ancillary data necessary for correlating leaf 
development with grain development directly. It is likely 
that the utility of soil moisture observations will be limited 
to monitoring extreme events over large time scales as was 
implied by Bolten et al. [2010], or for estimating irrigation 
schedules and agricultural water use as was done by Wang 
and Cai [2007]. 

[53] Achnowledgements. This research was jointly supported by a 
grant from the NASA Terrestrial Ecology program entitled “Ecological 
and agricultural productivity forecasting using root-zone soil moisture 
products derived from the NASA SMAP mission” (principal investigator 
W. T. Crow), the NASA SMAP Science Definition Team, and grant 08- 
SMAPSDT08-0042 (principal investigator M. S. Moran). The authors 
would like to thank Cheryl Porter from the Department of Agricultural and 
Biological Engineering at the University of Florida for her help acquiring 
and managing DSSAT source code. 

References 

Arnold, C. P., and C. H. Dey (1986), Observing-systems simulation experi- 
ments: Past, present, and future, Bull. Am. Meteorol. Soc., 67(6), 687- 
695, doi: 10.1 175/1520-0477(1986)067<0687:OSSEPP>2.0.CO;2. 
Bolten, J. D., W. T. Crow, X. W. Zhan, T. J. Jackson, and C. A. Reynolds 
(2010), Evaluating the utility of remotely sensed soil moisture retrievals 
for operational agricultural drought monitoring, IEEE J. Sel. Top. Appl. 
Earth Obs. Remote Sens., 5(1), 57-66, doi: 10.1 109/JSTARS.2009. 
2037163. 

Campbell, C. A., D. R. Cameron, W. Nicholaichuk, and H. R. Davidson 
(1977a), Effects of fertilizer N and soil moisture on growth, N content, 
and moisture use by spring wheat, Can. J. Soil Sci., 57(3), 289-310, 
doi: 10.4141/cjss77-036. 


Campbell, C. A., H. R. Davidson, and F. G. Warder (1977b), Effects of 
fertilizer N and soil moisture on yield, yield components, protein content 
and N accumulation in aboveground pars of spring wheat, Can. J. Soil 
Sci., 57(3), 31 1-327, doi: 10.4141/cjss77-035. 

Cosby, B. J., G. M. Homberger, R. B. Clapp, and T. R. Ginn (1984), A 
statistical exploration of the relationships of soil moisture characteristics 
to the physical properties of soils, Water Resour. Res., 20(6), 682-690, 
doi: 10.1 029/WR020i006p00682. 

Crow, W. T., R. D. Koster, R. H. Reichle, and H. O. Sharif (2005), Rele- 
vance of time-varying and time-invariant retrieval error sources on the 
utility of spacebome soil moisture products, Geophys. Res. Lett., 32, 
L24405, doi : 10. 1029/2005GL024889. 

de Wit, A. M., and C. A. van Diepen (2007), Crop model data assimilation 
with the ensemble Kalman filter for improving regional crop yield fore- 
casts, Agric. For. Meteorol., 746(1-2), 38-56, doi: 10.1016/j.agrformet. 
2007.05.004. 

Entekhabi, D., et al. (2010), The soil moisture active passive (SMAP) mis- 
sion, Proc. IEEE, 98(5), 704-716, doi: 10.1007/sl0236-003-0036-9. 

Evensen, G. (2003), The ensemble Kalman filter: Theoretical formulation 
and practical implementation, Ocean Dyn., 53, 343-367, doi: 10. 1007/ 
si 0236-003-0036-9. 

Gordon, N. J., D. J. Salmond, and A. F. M. Smith (1993), Novel 
approach to nonlinear non-Gaussian Bayesian state estimation, IEE 
Proc., Part F, Radar Signal Process., 140(2), 107-113, doi: 10. 1049/ 
ip-f-2. 1993.0015. 

Hoogenboom, G., et al. (2004), Decision support system for agrotechnology 
transfer version 4.0, technical report, Univ. of Hawaii, Honolulu. 

Houtekamer, P. L., and H. L. Mitchell (2001), A sequential ensemble Kalman 
filter for atmospheric data assimilation, Mon. Weather Rev., 129(\), 123— 
137, doi: 10.1 1 75/1520-0493(200 1)129<0123 :ASEKFF>2.0.CO;2. 

Jackson, T. J., and T. J. Schmugge (1991), Vegetation effects on the 
microwave emission of soils, Remote Sens. Environ., 36(3), 203-212, 
doi : 1 0. 1 0 1 6/0034-4257(9 1 )90057-D. 

Jones, J. W., G. Hoogenboom, C. H. Porter, K. J. Boote, W. D. Batchelor, 
L. A. Hunt, P. W. Wilkens, U. Singh, A. J. Gijsman, and J. T. Ritchie 
(2003), The DSSAT cropping system model, Eur. J. Agron., 18(3-4), 
23 5-265 , doi : 1 0 . 1 0 1 6/S 1 1 6 1-030 1 (02)00 1 07-7. 

Kerr, Y. H., et al. (2010), The SMOS mission: New tool for monitoring key 
elements of the global water cycle, Proc. IEEE, 98(5), 666-687, 
doi: 10.1 109/jproc.20 10.2043032. 

Kirdiashev, K., A. Chukhlantsev, and A. Shutko (1979), Microwave radia- 
tion of the Earth’s surface in the presence of vegetation cover, Radiotekh. 
Elektron. (Moscow, Russ. Fed.), 24, 256-264. 

Knyazikhin, Y., et al. (1999), MODIS Leaf Area Index (LAI) and Fraction 
of Photosynthetically Active Radiation Absorbed by Vegetation (FPAR) 
Product (MOD15) Algorithm Theoretical Basis Document, NASA God- 
dard Space Flight Center, Greenbelt, MD. 

Liu, Y. Q., and H. V. Gupta (2007), Uncertainty in hydrologic modeling: 
Toward an integrated data assimilation framework, Water Resour. Res., 
43, W0740 1 , doi : 1 0. 1 029/2006WR005756. 

Malhotra, R. (1933), A contribution to the biochemistry of the wheat plant, 
J. Biochem., 18(2), 199-205. 

McLaughlin, D. (2002), An integrated approach to hydrologic data assimi- 
lation: Interpolation, smoothing, and filtering, Adv. Water Resour., 
25(8-12), 1275-1286, doi: 10.10 16/S0309-1 708(02)00055-6. 

Mo, X. G., S. X. Liu, Z. H. Lin, Y. Q. Xu, Y. Xiang, and T. R. Me Vicar 
(2005), Prediction of crop yield, water consumption and water use effi- 
ciency with a SVAT-crop growth model using remotely sensed data on 
the North China Plain, Ecol. Modell., 183(2-3), 301-322, doi: 10.1016/ 
j .ecolmodel.2004.07.032. 

Moulin, S., A. Bondeau, and R. Delecolle (1998), Combining agricultural 
crop models and satellite observations : From field to regional scales, Int. 
J. Remote Sens. , 19(6), 1 02 1-1 036, doi : 1 0. 1 080/0143 1 1 6982 15586. 

Njoku, E., T. Jackson, V. Lakshmi, T. Chan, and S. Nghiem (2003), Soil 
moisture retrieval from AMSR-E, IEEE Trans. Geosci. Remote Sens., 
41, 215-229, doi: 10.1 109/TGRS.2002.808243. 

Pauwels, V. R. N., N. E. C. Verhoest, G. J. M. De Lannoy, V. Guissard, C. 
Lucau, and P. Defoumy (2007), Optimization of a coupled hydrology- 
crop growth model through the assimilation of observed soil moisture 
and leaf area index values using an ensemble Kalman filter, Water 
Resour. Res., 43, W04421, doi: 10.1 029/2006WR004942. 

Pellenq, J., and G. Boulet (2004), A methodology to test the pertinence of 
remote-sensing data assimilation into vegetation models for water and 
energy exchange at the land surface, Agronomie, 24, 197-204, doi: 10. 
1051 /agro : 20040 1 7 . 


12 of 13 



W05525 


NEARING ET AL. : ENKF AND SIRF OSSES FOR WHEAT YIELD 


W05525 


Prevot, L., H. Chauki, D. Troufleau, M. Weiss, F. Baret, and N. Brisson 
(2003), Assimilating optical and radar data into the sties crop model for 
wheat, Agronomie, 23(4), 297-303, doi: 10. 1051 /agro: 2003 003. 

Priestley, C., and R. J. Taylor (1972), Assessment of surface heat flux and 
evaporation using large-scale parameters, Mon. Weather Rev., 100(2), 
8 1-92, doi : 1 0. 1 1 75/ 1 520-0493( 1 972) 1 00< 008 1 : OTAOSH>2.3 .CO ;2. 

Reichle, R. H. (2008), Data assimilation methods in the Earth sciences, Adv. 
Water Resour., 37(1 1), 141 1-1418, doi: 10.1 016/j.advwatres.2008.0 1.001. 

Reichle, R. H., R. D. Koster, P. Liu, S. P. P. Mahanama, E. G. Njoku, and M. 
Owe (2007), Comparison and assimilation of global soil moisture retrievals 
from the Advanced Microwave Scanning Radiometer for the Earth Observ- 
ing System (AMSR-E) and the Scanning Multichannel Microwave Radiom- 
eter (SMMR), J. Geophys. Res., 112, D09108, doi : 10. 1029/2006 JD008033. 

Reichle, R. H., S. V. Kumar, S. P. P. Mahanama, R. D. Koster, and Q. Liu 
(2010), Assimilation of satellite-derived skin temperature observations 
into land surface models, J. Hydrometeorol., 77(5), 1103-1122, 
doi : 1 0. 1 1 75/20 1 0JHM 1 262. 1 . 

Richie, J. (1998), Soil Water Balance and Plant Water Stress, pp. 41-54, 
Kluwer Acad., Dordrecht, Netherlands. 

Tan, B., J. N. Hu, P. Zhang, D. Huang, N. Shabanov, M. Weiss, Y. Knya- 
zikhin, and R. B. Myneni (2005), Validation of Moderate Resolution 


Imaging Spectroradiometer leaf area index product in croplands of 
Alpilles, France, J. Geophys. Res., 110, D01 107, doi: 10. 1029/ 
2004JD004860. 

Wang, D. B., and X. M. Cai (2007), Optimal estimation of irrigation sched- 
ule : An example of quantifying human interferences to hydrologic proc- 
esses, Adv. Water Resour., 30( 8), 1844-1857, doi: 10.1 016/j.advwatres. 
2007.02.006. 


W. T. Crow, Hydrology and Remote Sensing Laboratory, ARS, USD A, 
10300 Baltimore Ave., Bldg. 007, Rm. 104, BARC-West, Beltsville, MD 
20705, USA. (wade.crow@ars.usda.gov) 

H. V. Gupta and G. S. Nearing, Department of Hydrology and Water 
Resources, University of Arizona, Harshbarger Bldg., Tucson, AZ 85721, 
USA. (hoshin.gupta@hwr.arizona.edu; grey@email.arizona.edu) 

M. S. Moran, Southwest Watershed Research Center, ARS, USDA, 
2000 E. Allen Rd., Tucson, AZ 85719, USA. (susan.moran@ars.usda.gov) 
R. H. Reichle, NASA Goddard Space Flight Center, Bldg. 33, Rm. 
G-106, Greenbelt Rd., Greenbelt, MD 20771, USA. (rolf.reichle@nasa.gov) 
K. R. Thorp, Arid-Land Agricultural Research Center, ARS, USDA, 
21881 North Cardon Ln., Maricopa, AZ 85138, USA. (kelly.thorp@ 
ars.usda.gov) 


13 of 13 



